Multi-Modal Music Information Retrieval - Visualisation and Evaluation of Clusterings by Both Audio and Lyrics

Authors

  • Robert Neumayer
  • Andreas Rauber
Abstract

Navigation in and access to the contents of digital audio archives have become increasingly important topics in Information Retrieval. Private and commercial music collections alike are growing in both size and acceptance within the user community. Content-based approaches relying on signal processing techniques have been used in Music Information Retrieval for some time to represent the acoustic characteristics of pieces of music, which may be used for collection organisation or retrieval tasks. However, music is defined not only by its acoustic characteristics but also, sometimes to a large degree, by its content in terms of lyrics. A song's lyrics provide additional information to search for and may be more representative of specific musical genres than the acoustic content, e.g. 'love songs' or 'Christmas carols'. We therefore suggest an improved indexing of audio files by two modalities. Combinations of audio features and song lyrics can be used to organise audio collections and to display them via map-based interfaces. Specifically, we use Self-Organising Maps as a visualisation and interface metaphor. Separate maps are created and linked to provide a multi-modal view of an audio collection. Moreover, we introduce quality measures for quantitative validation of cluster spreads across the resulting multiple topographic mappings provided by the Self-Organising Maps.
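The two-map idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy feature vectors, the function names (`train_som`, `best_matching_unit`, `mean_pairwise_grid_distance`) and the spread measure itself are assumptions made for illustration only, since the abstract does not define the quality measures it introduces. The sketch trains one small Self-Organising Map per modality and then compares how far apart a group of songs lands on each map.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position (row, col) whose weight vector is closest to x."""
    return min(weights, key=lambda u: sum((a - b) ** 2 for a, b in zip(weights[u], x)))


def train_som(data, rows, cols, epochs=20, lr0=0.5, sigma0=None, seed=0):
    """Train a small rectangular SOM; returns a dict mapping (row, col) to a weight vector."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = sigma0 or max(rows, cols) / 2.0
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighbourhood radius
            x = data[i]
            bmu = best_matching_unit(weights, x)
            for unit, w in weights.items():
                grid_d2 = (unit[0] - bmu[0]) ** 2 + (unit[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))  # Gaussian neighbourhood
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights


def mean_pairwise_grid_distance(units):
    """Hypothetical spread measure: mean grid distance between the units a group maps to."""
    pairs = [(a, b) for i, a in enumerate(units) for b in units[i + 1:]]
    if not pairs:
        return 0.0
    return sum(math.hypot(a[0] - b[0], a[1] - b[1]) for a, b in pairs) / len(pairs)


# Toy data (invented): one audio vector and one lyrics vector per song.
audio = [[0.10, 0.20], [0.15, 0.25], [0.90, 0.80], [0.85, 0.90]]
lyrics = [[0.90, 0.10], [0.20, 0.80], [0.85, 0.15], [0.25, 0.75]]

audio_map = train_som(audio, rows=3, cols=3, seed=1)
lyrics_map = train_som(lyrics, rows=3, cols=3, seed=1)

# Spread of the same group of songs (here songs 0 and 1) on each map.
group = [0, 1]
audio_spread = mean_pairwise_grid_distance([best_matching_unit(audio_map, audio[i]) for i in group])
lyrics_spread = mean_pairwise_grid_distance([best_matching_unit(lyrics_map, lyrics[i]) for i in group])
print("audio spread:", audio_spread, "lyrics spread:", lyrics_spread)
```

A group that is tightly clustered on one map but scattered on the other is the kind of case a cross-map spread measure is meant to expose; the mean pairwise grid distance used here is only one simple way to quantify it.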

Similar resources

MASTERARBEIT ATLANTIS Or Towards a Multi-Modal Approach to Music Information Retrieval and its Visualisation

Various aspects of the organisation of media archives and collections have produced eager interest in recent years. The Music Information Retrieval community has been gaining many insights into the area of abstract representations of music by means of audio signal processing. On top of that, recommendation engines are built to provide novel ways of creating playlists based on users’ preferences...

Multi-modal Analysis of Music: A large-scale Evaluation

Multimedia data by definition comprises several different types of content modalities. Music specifically inherits e.g. audio at its core, text in the form of lyrics, images by means of album covers, or video in the form of music videos. Yet, in many Music Information Retrieval applications, only the audio content is utilised. Recent studies have shown the usefulness of incorporating other moda...

Deep Cross-Modal Correlation Learning for Audio and Lyrics in Music Retrieval

Deep cross-modal learning has successfully demonstrated excellent performances in cross-modal multimedia retrieval, with the aim of learning joint representations between different data modalities. Unfortunately, little research focuses on cross-modal correlation learning where temporal structures of different data modalities such as audio and lyrics are taken into account. Stemming from the ch...

Music Genre Classification by Ensembles of Audio and Lyrics Features

Algorithms that can understand and interpret characteristics of music, and organise them for and recommend them to their users can be of great assistance in handling the ever growing size of both private and commercial collections. Music is an inherently multi-modal type of data, and the lyrics associated with the music are as essential to the reception and the message of a song as is the audio...

Toward Multi-modal Music Emotion Classification

The performance of categorical music emotion classification that divides emotion into classes and uses audio features alone for emotion classification has reached a limit due to the presence of a semantic gap between the object feature level and the human cognitive level of emotion perception. Motivated by the fact that lyrics carry rich semantic information of a song, we propose a multi-modal ...


Publication date: 2007